Actor-independent action search using spatiotemporal vocabulary with appearance hashing
Abstract
Human actions in movies and sitcoms usually carry semantic cues for story understanding, which offers a novel search pattern beyond the traditional video search scenario. However, action-level video search faces major challenges, such as global motion, concurrent actions, and variation in actor appearance. In this paper, we introduce a generalized action retrieval framework that achieves fully unsupervised, robust, and actor-independent action search in large-scale databases. First, an Attention Shift model is presented to extract human-focused foreground actions from videos containing global motion or concurrent actions. Subsequently, a spatiotemporal vocabulary is built from 3D-SIFT features extracted from these human-focused action regions. These 3D-SIFT features offer robustness against rotation and viewpoint changes, and the spatiotemporal vocabulary ensures search efficiency through an inverted indexing structure with approximate nearest-neighbor search. In the online ranking, we employ the dynamic time warping distance to handle variation in action duration as well as partial action matching. Finally, an appearance hashing strategy is presented to address the performance degradation caused by divergent actor appearances. For experimental validation, we have deployed the actor-independent action retrieval framework on three seasons of the sitcom "Friends" (over 30 h). On this database, we report the best performance (MAP@1 > 0.53) in comparison with alternative and state-of-the-art approaches. Crown Copyright © 2010 Published by Elsevier Ltd. All rights reserved.
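As a rough illustration of the online ranking step, the sketch below computes a textbook dynamic time warping distance between two sequences of per-frame feature vectors. It is not the authors' implementation: the function name `dtw_distance`, the `partial` flag, the Euclidean frame distance, and the toy one-dimensional features are all assumptions made for this example. With `partial=True` the query may align to any contiguous stretch of the target, which is one common way to approximate partial matching.

```python
import math


def dtw_distance(query, target, partial=False):
    """Dynamic time warping distance between two sequences of feature vectors.

    query, target: lists of equal-dimension numeric vectors (one per frame).
    partial: if True, the query may start and end anywhere inside the target
             (subsequence DTW); this flag is an assumption of this sketch.
    """
    n, m = len(query), len(target)
    inf = math.inf
    # cost[i][j] = best cost of aligning query[:i] with target[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    if partial:
        # Skipping any prefix of the target is free.
        for j in range(1, m + 1):
            cost[0][j] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(query[i - 1], target[j - 1])  # Euclidean frame distance
            cost[i][j] = d + min(
                cost[i - 1][j],      # query frame repeated against the target
                cost[i][j - 1],      # target frame repeated against the query
                cost[i - 1][j - 1],  # both sequences advance
            )
    if partial:
        # Skipping any suffix of the target is free as well.
        return min(cost[n][1:])
    return cost[n][m]


# The same action performed at two speeds aligns with zero cost.
slow = [[0.0], [0.0], [1.0], [1.0], [2.0]]
fast = [[0.0], [1.0], [2.0]]
same_action = dtw_distance(fast, slow)

# A short query found inside a longer clip.
clip = [[9.0], [1.0], [1.0], [2.0], [9.0]]
inside = dtw_distance([[1.0], [2.0]], clip, partial=True)
```

Because the warping path can repeat frames on either side, differences in action duration do not by themselves increase the distance; only mismatched frame content does.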
Related resources
Learning Vocabulary-Based Hashing with AdaBoost
Approximate near neighbor search plays a critical role in various kinds of multimedia applications. The vocabulary-based hashing scheme uses vocabularies, i.e. selected sets of feature points, to define a hash function family. The function family can be employed to build an approximate near neighbor search index. The critical problem in vocabulary-based hashing is the criteria of choosing vocab...
Action recognition with appearance-motion features and fast search trees
In this paper we propose an approach for action recognition based on a vocabulary of local motion-appearance features and fast approximate search in a large number of trees. Large numbers of features with associated motion vectors are extracted from video data and are represented by many trees. Multiple interest point detectors are used to provide features for every frame. The motion vectors fo...
A Practical Dramaturgy for Actors through Theatrical Production Procedures
Dramaturgy, as a creative and critical act dependent on the theoretical and practical knowledge of theater, consists of two parts with Greek roots: Drama (action) and Ourgia (work and operation). Ourgia literally means to practice, to act, or in other words to process a raw material. Eugenio Barba divides Dramaturgy into three parts: Actor Dramaturgy, Director Dramaturgy and Audience Dramaturgy. Th...
Clustering is Efficient for Approximate Maximum Inner Product Search
Efficient Maximum Inner Product Search (MIPS) is an important task that has a wide applicability in recommendation systems and classification with a large number of classes. Solutions based on locality-sensitive hashing (LSH) as well as tree-based solutions have been investigated in the recent literature, to perform approximate MIPS in sublinear time. In this paper, we compare these to another ...
Guess Where? Actor-Supervision for Spatiotemporal Action Localization
This paper addresses the problem of spatiotemporal localization of actions in videos. Compared to leading approaches, which all learn to localize based on carefully annotated boxes on training video frames, we adhere to a weakly-supervised solution that only requires a video class label. We introduce an actor-supervised architecture that exploits the inherent compositionality of actions in term...
Journal title:
- Pattern Recognition
Volume 44
Publication date: 2011